Overview
Dataset statistics
| Number of variables | 19 |
|---|---|
| Number of observations | 96 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 15.0 KiB |
| Average record size in memory | 160.0 B |
Variable types
| Numeric | 17 |
|---|---|
| Categorical | 2 |
Alerts
| subtokenization_indicator_min has constant value "1.0" | Constant |
|---|---|
| GOODS_DESCRIPTION_len_chars_max is highly overall correlated with GOODS_DESCRIPTION_len_chars_mean and 11 other fields | High correlation |
| GOODS_DESCRIPTION_len_chars_mean is highly overall correlated with GOODS_DESCRIPTION_len_chars_max and 12 other fields | High correlation |
| GOODS_DESCRIPTION_len_chars_median is highly overall correlated with GOODS_DESCRIPTION_len_chars_mean and 7 other fields | High correlation |
| GOODS_DESCRIPTION_len_chars_min is highly overall correlated with GOODS_DESCRIPTION_len_chars_max and 7 other fields | High correlation |
| GOODS_DESCRIPTION_len_chars_std is highly overall correlated with GOODS_DESCRIPTION_len_chars_max and 4 other fields | High correlation |
| GOODS_DESCRIPTION_len_chars_sum is highly overall correlated with GOODS_DESCRIPTION_len_chars_max and 10 other fields | High correlation |
| GOODS_DESCRIPTION_len_words_max is highly overall correlated with GOODS_DESCRIPTION_len_chars_max and 11 other fields | High correlation |
| GOODS_DESCRIPTION_len_words_mean is highly overall correlated with GOODS_DESCRIPTION_len_chars_max and 11 other fields | High correlation |
| GOODS_DESCRIPTION_len_words_median is highly overall correlated with GOODS_DESCRIPTION_len_chars_max and 9 other fields | High correlation |
| GOODS_DESCRIPTION_len_words_min is highly overall correlated with GOODS_DESCRIPTION_len_chars_min | High correlation |
| GOODS_DESCRIPTION_len_words_std is highly overall correlated with GOODS_DESCRIPTION_len_chars_max and 4 other fields | High correlation |
| GOODS_DESCRIPTION_len_words_sum is highly overall correlated with GOODS_DESCRIPTION_len_chars_max and 10 other fields | High correlation |
| HS06_count is highly overall correlated with GOODS_DESCRIPTION_len_chars_max and 9 other fields | High correlation |
| subtokenization_indicator_max is highly overall correlated with GOODS_DESCRIPTION_len_chars_max and 12 other fields | High correlation |
| subtokenization_indicator_mean is highly overall correlated with GOODS_DESCRIPTION_len_chars_mean and 4 other fields | High correlation |
| subtokenization_indicator_median is highly overall correlated with subtokenization_indicator_mean and 1 other field | High correlation |
| subtokenization_indicator_std is highly overall correlated with subtokenization_indicator_max and 2 other fields | High correlation |
| subtokenization_indicator_sum is highly overall correlated with GOODS_DESCRIPTION_len_chars_max and 10 other fields | High correlation |
| GOODS_DESCRIPTION_len_words_min is highly imbalanced (79.9%) | Imbalance |
| GOODS_DESCRIPTION_len_words_sum has unique values | Unique |
| GOODS_DESCRIPTION_len_words_std has unique values | Unique |
| GOODS_DESCRIPTION_len_chars_mean has unique values | Unique |
| GOODS_DESCRIPTION_len_chars_std has unique values | Unique |
| subtokenization_indicator_sum has unique values | Unique |
| subtokenization_indicator_mean has unique values | Unique |
| subtokenization_indicator_std has unique values | Unique |
Reproduction
| Analysis started | 2025-05-15 18:03:21.035223 |
|---|---|
| Analysis finished | 2025-05-15 18:05:11.917202 |
| Duration | 1 minute and 50.88 seconds |
| Software version | ydata-profiling v4.12.1 |
Variables
HS06_count
Real number (ℝ)
High correlation 
| Distinct | 89 |
|---|---|
| Distinct (%) | 92.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2789.375 |
| Minimum | 5 |
|---|---|
| Maximum | 54901 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 5 |
|---|---|
| 5-th percentile | 14 |
| Q1 | 121.25 |
| median | 527.5 |
| Q3 | 2109.25 |
| 95-th percentile | 11762.75 |
| Maximum | 54901 |
| Range | 54896 |
| Interquartile range (IQR) | 1988 |
Descriptive statistics
| Standard deviation | 7357.814 |
|---|---|
| Coefficient of variation (CV) | 2.6378002 |
| Kurtosis | 30.409234 |
| Mean | 2789.375 |
| Median Absolute Deviation (MAD) | 507 |
| Skewness | 5.1538992 |
| Sum | 267780 |
| Variance | 54137426 |
| Monotonicity | Not monotonic |
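Several of the HS06_count figures above are derived from one another, so they can be cross-checked by hand. The sketch below recomputes the coefficient of variation, range, IQR, sum, and variance from the reported mean, standard deviation, extremes, and quartiles. The variable names are illustrative, and the small differences in the last digits come from the rounding of the reported inputs.

```python
# Values copied from the HS06_count tables above.
n = 96                       # number of observations
mean, std = 2789.375, 7357.814
minimum, maximum = 5, 54901
q1, q3 = 121.25, 2109.25

cv = std / mean              # coefficient of variation
value_range = maximum - minimum
iqr = q3 - q1                # interquartile range
total = mean * n             # sum of all values
variance = std ** 2          # variance is the squared standard deviation

print(f"CV={cv:.7f} range={value_range} IQR={iqr} sum={total} variance={variance:.0f}")
```

A coefficient of variation well above 1, together with a median (527.5) far below the mean, is consistent with the strong right skew reported for this variable.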
| Value | Count | Frequency (%) |
| 317 | 3 | 3.1% |
| 11 | 2 | 2.1% |
| 293 | 2 | 2.1% |
| 15 | 2 | 2.1% |
| 19 | 2 | 2.1% |
| 468 | 2 | 2.1% |
| 3793 | 1 | 1.0% |
| 2422 | 1 | 1.0% |
| 1533 | 1 | 1.0% |
| 233 | 1 | 1.0% |
| Other values (79) | 79 |
| Value | Count | Frequency (%) |
| 5 | 1 | |
| 6 | 1 | |
| 10 | 1 | |
| 11 | 2 | |
| 15 | 2 | |
| 19 | 2 | |
| 22 | 1 | |
| 27 | 1 | |
| 32 | 1 | |
| 34 | 1 |
| Value | Count | Frequency (%) |
| 54901 | 1 | |
| 33571 | 1 | |
| 28476 | 1 | |
| 16173 | 1 | |
| 12218 | 1 | |
| 11611 | 1 | |
| 7972 | 1 | |
| 7921 | 1 | |
| 7526 | 1 | |
| 4285 | 1 |
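The 5-th and 95-th percentiles reported for HS06_count can be recovered from the smallest and largest values listed above. The sketch below assumes quantiles are computed by linear interpolation between order statistics (the default in pandas); the helper names `order_stat` and `linear_quantile` are hypothetical, and only the ranks near the tails are known from the report.

```python
N = 96  # number of observations

# Smallest values, expanded by their counts (ranks 0..12).
lowest = [5, 6, 10, 11, 11, 15, 15, 19, 19, 22, 27, 32, 34]
# Largest values in ascending order (ranks 86..95).
highest = [4285, 7526, 7921, 7972, 11611, 12218, 16173, 28476, 33571, 54901]


def order_stat(rank):
    """Return the value at a 0-based rank, if the report lists it."""
    if rank < len(lowest):
        return lowest[rank]
    first_high_rank = N - len(highest)
    if rank >= first_high_rank:
        return highest[rank - first_high_rank]
    raise ValueError(f"rank {rank} is not listed in the report")


def linear_quantile(q):
    """Interpolate linearly between the two order statistics around q * (N - 1)."""
    position = q * (N - 1)
    lower_rank = int(position)
    fraction = position - lower_rank
    lower, upper = order_stat(lower_rank), order_stat(lower_rank + 1)
    return lower + fraction * (upper - lower)


p05 = linear_quantile(0.05)  # position 4.75, between 11 and 15
p95 = linear_quantile(0.95)  # position 90.25, between 11611 and 12218
print(p05, p95)
```

Under this assumption both results agree with the quantile table (14 and 11762.75).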
GOODS_DESCRIPTION_len_words_sum
Real number (ℝ)
High correlation  Unique 
| Distinct | 96 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12799.042 |
| Minimum | 15 |
|---|---|
| Maximum | 256770 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 15 |
|---|---|
| 5-th percentile | 34.5 |
| Q1 | 608.5 |
| median | 2135.5 |
| Q3 | 9261.75 |
| 95-th percentile | 53337.75 |
| Maximum | 256770 |
| Range | 256755 |
| Interquartile range (IQR) | 8653.25 |
Descriptive statistics
| Standard deviation | 34999.098 |
|---|---|
| Coefficient of variation (CV) | 2.7345092 |
| Kurtosis | 29.491323 |
| Mean | 12799.042 |
| Median Absolute Deviation (MAD) | 2052.5 |
| Skewness | 5.1288628 |
| Sum | 1228708 |
| Variance | 1.2249368 × 10⁹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 775 | 1 | 1.0% |
| 1558 | 1 | 1.0% |
| 1270 | 1 | 1.0% |
| 12460 | 1 | 1.0% |
| 11014 | 1 | 1.0% |
| 6693 | 1 | 1.0% |
| 779 | 1 | 1.0% |
| 641 | 1 | 1.0% |
| 2591 | 1 | 1.0% |
| 7700 | 1 | 1.0% |
| Other values (86) | 86 |
| Value | Count | Frequency (%) |
| 15 | 1 | |
| 20 | 1 | |
| 23 | 1 | |
| 27 | 1 | |
| 30 | 1 | |
| 36 | 1 | |
| 46 | 1 | |
| 52 | 1 | |
| 81 | 1 | |
| 85 | 1 |
| Value | Count | Frequency (%) |
| 256770 | 1 | |
| 157199 | 1 | |
| 150753 | 1 | |
| 73862 | 1 | |
| 53412 | 1 | |
| 53313 | 1 | |
| 36418 | 1 | |
| 33864 | 1 | |
| 33169 | 1 | |
| 18492 | 1 |
GOODS_DESCRIPTION_len_words_min
Categorical
High correlation  Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 KiB |
| 1 | 93 |
|---|---|
| 2 | 3 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 93 | 96.9% |
| 2 | 3 | 3.1% |
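The 79.9% imbalance flagged for this variable can be reproduced from the two counts above. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalized by the maximum entropy for that number of classes; the function name `imbalance_score` is illustrative, and the formula is an assumption that happens to match the reported figure.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 when one class dominates."""
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    total = sum(counts)
    probabilities = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probabilities)
    return 1 - entropy / math.log2(k)


# Counts from the Common Values table: value 1 occurs 93 times, value 2 occurs 3 times.
score = imbalance_score([93, 3])
print(f"{score:.1%}")
```

A perfectly balanced two-class column would score 0 under the same formula, so a value near 80% indicates that almost all rows share the minimum word count of 1.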
GOODS_DESCRIPTION_len_words_mean
Real number (ℝ)
High correlation 
| Distinct | 95 |
|---|---|
| Distinct (%) | 99.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.1185112 |
| Minimum | 2.0909091 |
|---|---|
| Maximum | 6.4683544 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 2.0909091 |
|---|---|
| 5-th percentile | 2.8858234 |
| Q1 | 3.7481697 |
| median | 4.1198804 |
| Q3 | 4.5974651 |
| 95-th percentile | 5.2906959 |
| Maximum | 6.4683544 |
| Range | 4.3774453 |
| Interquartile range (IQR) | 0.8492954 |
Descriptive statistics
| Standard deviation | 0.76212921 |
|---|---|
| Coefficient of variation (CV) | 0.18504969 |
| Kurtosis | 0.72674148 |
| Mean | 4.1185112 |
| Median Absolute Deviation (MAD) | 0.47154938 |
| Skewness | -0.041199556 |
| Sum | 395.37708 |
| Variance | 0.58084093 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 3 | 2 | 2.1% |
| 6.007751938 | 1 | 1.0% |
| 3.845768521 | 1 | 1.0% |
| 4.613106257 | 1 | 1.0% |
| 4.54748142 | 1 | 1.0% |
| 4.365949119 | 1 | 1.0% |
| 3.343347639 | 1 | 1.0% |
| 4.056962025 | 1 | 1.0% |
| 3.760522496 | 1 | 1.0% |
| 4.05904059 | 1 | 1.0% |
| Other values (85) | 85 |
| Value | Count | Frequency (%) |
| 2.090909091 | 1 | |
| 2.4 | 1 | |
| 2.454545455 | 1 | |
| 2.589905363 | 1 | |
| 2.736842105 | 1 | |
| 2.935483871 | 1 | |
| 3 | 2 | |
| 3.066666667 | 1 | |
| 3.079545455 | 1 | |
| 3.109090909 | 1 |
| Value | Count | Frequency (%) |
| 6.46835443 | 1 | |
| 6.007751938 | 1 | |
| 5.550689376 | 1 | |
| 5.317406143 | 1 | |
| 5.294037084 | 1 | |
| 5.289582107 | 1 | |
| 5.250712251 | 1 | |
| 5.1125 | 1 | |
| 5.071669794 | 1 | |
| 5.071428571 | 1 |
GOODS_DESCRIPTION_len_words_median
Real number (ℝ)
High correlation 
| Distinct | 6 |
|---|---|
| Distinct (%) | 6.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.578125 |
| Minimum | 2 |
|---|---|
| Maximum | 6 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 3 |
| median | 4 |
| Q3 | 4 |
| 95-th percentile | 5 |
| Maximum | 6 |
| Range | 4 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.77909571 |
|---|---|
| Coefficient of variation (CV) | 0.21773854 |
| Kurtosis | 0.29014474 |
| Mean | 3.578125 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -0.044005382 |
| Sum | 343.5 |
| Variance | 0.60699013 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4 | 49 | 51.0% |
| 3 | 30 | 31.2% |
| 2 | 7 | 7.3% |
| 5 | 6 | 6.2% |
| 2.5 | 3 | 3.1% |
| 6 | 1 | 1.0% |
| Value | Count | Frequency (%) |
| 2 | 7 | 7.3% |
| 2.5 | 3 | 3.1% |
| 3 | 30 | 31.2% |
| 4 | 49 | 51.0% |
| 5 | 6 | 6.2% |
| 6 | 1 | 1.0% |
| Value | Count | Frequency (%) |
| 6 | 1 | 1.0% |
| 5 | 6 | 6.2% |
| 4 | 49 | 51.0% |
| 3 | 30 | 31.2% |
| 2.5 | 3 | 3.1% |
| 2 | 7 | 7.3% |
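Because GOODS_DESCRIPTION_len_words_median has only six distinct values, its frequency table fully determines the column, and the summary statistics can be rebuilt from it. The sketch below expands the table into 96 values and recomputes the mean, sum, median, variance, and MAD with the standard library. It assumes the reported variance is the sample variance (divisor n - 1) and that MAD is the median of absolute deviations from the median.

```python
import statistics

# Value -> count, copied from the frequency table above.
frequency_table = {4.0: 49, 3.0: 30, 2.0: 7, 5.0: 6, 2.5: 3, 6.0: 1}

# Expand the table into the individual observations.
values = sorted(v for v, count in frequency_table.items() for _ in range(count))

mean = statistics.mean(values)
total = sum(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance, divisor n - 1
mad = statistics.median([abs(v - median) for v in values])

print(len(values), mean, total, median, round(variance, 8), mad)
```

The MAD of 0 follows from more than half of the rows (49 of 96) sitting exactly on the median value of 4.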
GOODS_DESCRIPTION_len_words_max
Real number (ℝ)
High correlation 
| Distinct | 33 |
|---|---|
| Distinct (%) | 34.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 18.447917 |
| Minimum | 3 |
|---|---|
| Maximum | 41 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 3 |
|---|---|
| 5-th percentile | 5.75 |
| Q1 | 13 |
| median | 17.5 |
| Q3 | 24 |
| 95-th percentile | 32.25 |
| Maximum | 41 |
| Range | 38 |
| Interquartile range (IQR) | 11 |
Descriptive statistics
| Standard deviation | 7.9204862 |
|---|---|
| Coefficient of variation (CV) | 0.42934312 |
| Kurtosis | -0.057594053 |
| Mean | 18.447917 |
| Median Absolute Deviation (MAD) | 5.5 |
| Skewness | 0.37294986 |
| Sum | 1771 |
| Variance | 62.734101 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 14 | 8 | 8.3% |
| 23 | 6 | 6.2% |
| 20 | 5 | 5.2% |
| 13 | 5 | 5.2% |
| 17 | 5 | 5.2% |
| 15 | 5 | 5.2% |
| 21 | 5 | 5.2% |
| 12 | 4 | 4.2% |
| 25 | 4 | 4.2% |
| 16 | 4 | 4.2% |
| Other values (23) | 45 |
| Value | Count | Frequency (%) |
| 3 | 1 | 1.0% |
| 4 | 1 | 1.0% |
| 5 | 3 | |
| 6 | 1 | 1.0% |
| 7 | 2 | |
| 8 | 1 | 1.0% |
| 9 | 3 | |
| 10 | 3 | |
| 11 | 2 | |
| 12 | 4 |
| Value | Count | Frequency (%) |
| 41 | 1 | 1.0% |
| 37 | 2 | |
| 34 | 1 | 1.0% |
| 33 | 1 | 1.0% |
| 32 | 1 | 1.0% |
| 31 | 1 | 1.0% |
| 29 | 1 | 1.0% |
| 28 | 3 | |
| 27 | 3 | |
| 26 | 4 |
GOODS_DESCRIPTION_len_words_std
Real number (ℝ)
High correlation  Unique 
| Distinct | 96 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.4903831 |
| Minimum | 0.53935989 |
|---|---|
| Maximum | 4.1689802 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 0.53935989 |
|---|---|
| 5-th percentile | 1.3730419 |
| Q1 | 2.1834948 |
| median | 2.5404806 |
| Q3 | 2.9078441 |
| 95-th percentile | 3.5554833 |
| Maximum | 4.1689802 |
| Range | 3.6296203 |
| Interquartile range (IQR) | 0.72434935 |
Descriptive statistics
| Standard deviation | 0.6249779 |
|---|---|
| Coefficient of variation (CV) | 0.25095653 |
| Kurtosis | 0.94486299 |
| Mean | 2.4903831 |
| Median Absolute Deviation (MAD) | 0.36738186 |
| Skewness | -0.38359552 |
| Sum | 239.07678 |
| Variance | 0.39059738 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 3.859841569 | 1 | 1.0% |
| 3.04727659 | 1 | 1.0% |
| 2.163082001 | 1 | 1.0% |
| 2.904064587 | 1 | 1.0% |
| 2.926473309 | 1 | 1.0% |
| 2.557651519 | 1 | 1.0% |
| 2.187870352 | 1 | 1.0% |
| 2.84021097 | 1 | 1.0% |
| 2.754881031 | 1 | 1.0% |
| 2.391902154 | 1 | 1.0% |
| Other values (86) | 86 |
| Value | Count | Frequency (%) |
| 0.53935989 | 1 | |
| 1.035725481 | 1 | |
| 1.055597326 | 1 | |
| 1.099783528 | 1 | |
| 1.247219129 | 1 | |
| 1.414982776 | 1 | |
| 1.446916463 | 1 | |
| 1.484174497 | 1 | |
| 1.531335925 | 1 | |
| 1.608083757 | 1 |
| Value | Count | Frequency (%) |
| 4.168980208 | 1 | |
| 3.859841569 | 1 | |
| 3.607145572 | 1 | |
| 3.575449727 | 1 | |
| 3.574472297 | 1 | |
| 3.549153593 | 1 | |
| 3.461727596 | 1 | |
| 3.155637188 | 1 | |
| 3.130722923 | 1 | |
| 3.130303308 | 1 |
GOODS_DESCRIPTION_len_chars_sum
Real number (ℝ)
High correlation 
| Distinct | 95 |
|---|---|
| Distinct (%) | 99.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 81515.052 |
| Minimum | 95 |
|---|---|
| Maximum | 1657412 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 95 |
|---|---|
| 5-th percentile | 204.75 |
| Q1 | 4016.5 |
| median | 14208.5 |
| Q3 | 55776.75 |
| 95-th percentile | 343578 |
| Maximum | 1657412 |
| Range | 1657317 |
| Interquartile range (IQR) | 51760.25 |
Descriptive statistics
| Standard deviation | 226820.03 |
|---|---|
| Coefficient of variation (CV) | 2.782554 |
| Kurtosis | 29.502104 |
| Mean | 81515.052 |
| Median Absolute Deviation (MAD) | 13616.5 |
| Skewness | 5.1507801 |
| Sum | 7825445 |
| Variance | 5.1447328 × 10¹⁰ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 11397 | 2 | 2.1% |
| 4417 | 1 | 1.0% |
| 83074 | 1 | 1.0% |
| 8432 | 1 | 1.0% |
| 76719 | 1 | 1.0% |
| 67240 | 1 | 1.0% |
| 42107 | 1 | 1.0% |
| 4977 | 1 | 1.0% |
| 4261 | 1 | 1.0% |
| 15320 | 1 | 1.0% |
| Other values (85) | 85 |
| Value | Count | Frequency (%) |
| 95 | 1 | |
| 123 | 1 | |
| 148 | 1 | |
| 172 | 1 | |
| 204 | 1 | |
| 205 | 1 | |
| 261 | 1 | |
| 388 | 1 | |
| 494 | 1 | |
| 501 | 1 |
| Value | Count | Frequency (%) |
| 1657412 | 1 | |
| 1023987 | 1 | |
| 998884 | 1 | |
| 443120 | 1 | |
| 364062 | 1 | |
| 336750 | 1 | |
| 226624 | 1 | |
| 205919 | 1 | |
| 204300 | 1 | |
| 120224 | 1 |
GOODS_DESCRIPTION_len_chars_min
Real number (ℝ)
High correlation 
| Distinct | 9 |
|---|---|
| Distinct (%) | 9.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.9791667 |
| Minimum | 2 |
|---|---|
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 3 |
| median | 4 |
| Q3 | 4 |
| 95-th percentile | 7.25 |
| Maximum | 10 |
| Range | 8 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.5826638 |
|---|---|
| Coefficient of variation (CV) | 0.39773749 |
| Kurtosis | 3.2515458 |
| Mean | 3.9791667 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.661975 |
| Sum | 382 |
| Variance | 2.5048246 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 3 | 34 | 35.4% |
| 4 | 32 | 33.3% |
| 2 | 9 | 9.4% |
| 5 | 8 | 8.3% |
| 6 | 6 | 6.2% |
| 8 | 2 | 2.1% |
| 7 | 2 | 2.1% |
| 9 | 2 | 2.1% |
| 10 | 1 | 1.0% |
| Value | Count | Frequency (%) |
| 2 | 9 | 9.4% |
| 3 | 34 | 35.4% |
| 4 | 32 | 33.3% |
| 5 | 8 | 8.3% |
| 6 | 6 | 6.2% |
| 7 | 2 | 2.1% |
| 8 | 2 | 2.1% |
| 9 | 2 | 2.1% |
| 10 | 1 | 1.0% |
| Value | Count | Frequency (%) |
| 10 | 1 | 1.0% |
| 9 | 2 | 2.1% |
| 8 | 2 | 2.1% |
| 7 | 2 | 2.1% |
| 6 | 6 | 6.2% |
| 5 | 8 | 8.3% |
| 4 | 32 | 33.3% |
| 3 | 34 | 35.4% |
| 2 | 9 | 9.4% |
GOODS_DESCRIPTION_len_chars_mean
Real number (ℝ)
High correlation  Unique 
| Distinct | 96 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 25.890457 |
| Minimum | 13.454545 |
|---|---|
| Maximum | 41.556962 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 13.454545 |
|---|---|
| 5-th percentile | 17.946774 |
| Q1 | 23.215338 |
| median | 25.811239 |
| Q3 | 28.466038 |
| 95-th percentile | 34.333797 |
| Maximum | 41.556962 |
| Range | 28.102417 |
| Interquartile range (IQR) | 5.2507007 |
Descriptive statistics
| Standard deviation | 4.8779838 |
|---|---|
| Coefficient of variation (CV) | 0.18840856 |
| Kurtosis | 0.8090321 |
| Mean | 25.890457 |
| Median Absolute Deviation (MAD) | 2.6208265 |
| Skewness | 0.096608964 |
| Sum | 2485.4839 |
| Variance | 23.794726 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 34.24031008 | 1 | 1.0% |
| 33.64163823 | 1 | 1.0% |
| 21.51020408 | 1 | 1.0% |
| 28.40392447 | 1 | 1.0% |
| 27.76218002 | 1 | 1.0% |
| 27.46705806 | 1 | 1.0% |
| 21.36051502 | 1 | 1.0% |
| 26.96835443 | 1 | 1.0% |
| 22.23512337 | 1 | 1.0% |
| 25.2398524 | 1 | 1.0% |
| Other values (86) | 86 |
| Value | Count | Frequency (%) |
| 13.45454545 | 1 | |
| 13.66666667 | 1 | |
| 15.63636364 | 1 | |
| 16.5615142 | 1 | |
| 17.4 | 1 | |
| 18.12903226 | 1 | |
| 18.60984848 | 1 | |
| 19 | 1 | |
| 19.70909091 | 1 | |
| 20.3881932 | 1 |
| Value | Count | Frequency (%) |
| 41.55696203 | 1 | |
| 36.00416667 | 1 | |
| 35.33495539 | 1 | |
| 35.07810086 | 1 | |
| 34.61425891 | 1 | |
| 34.24031008 | 1 | |
| 33.64163823 | 1 | |
| 32.58646617 | 1 | |
| 32.24543849 | 1 | |
| 31.91067538 | 1 |
GOODS_DESCRIPTION_len_chars_median
Real number (ℝ)
High correlation 
| Distinct | 26 |
|---|---|
| Distinct (%) | 27.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 22.463542 |
| Minimum | 9 |
|---|---|
| Maximum | 36 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 9 |
|---|---|
| 5-th percentile | 14.875 |
| Q1 | 20 |
| median | 22.25 |
| Q3 | 25.125 |
| 95-th percentile | 30 |
| Maximum | 36 |
| Range | 27 |
| Interquartile range (IQR) | 5.125 |
Descriptive statistics
| Standard deviation | 4.5400236 |
|---|---|
| Coefficient of variation (CV) | 0.20210632 |
| Kurtosis | 0.65863743 |
| Mean | 22.463542 |
| Median Absolute Deviation (MAD) | 2.75 |
| Skewness | -0.13618489 |
| Sum | 2156.5 |
| Variance | 20.611815 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 22 | 12 | |
| 24 | 10 | |
| 26 | 7 | 7.3% |
| 21 | 7 | 7.3% |
| 20 | 7 | 7.3% |
| 25 | 7 | 7.3% |
| 27 | 7 | 7.3% |
| 23 | 6 | 6.2% |
| 18 | 5 | 5.2% |
| 17 | 5 | 5.2% |
| Other values (16) | 23 |
| Value | Count | Frequency (%) |
| 9 | 1 | 1.0% |
| 12 | 1 | 1.0% |
| 13 | 1 | 1.0% |
| 14 | 1 | 1.0% |
| 14.5 | 1 | 1.0% |
| 15 | 2 | 2.1% |
| 16 | 1 | 1.0% |
| 17 | 5 | |
| 18 | 5 | |
| 19 | 4 |
| Value | Count | Frequency (%) |
| 36 | 1 | 1.0% |
| 32 | 1 | 1.0% |
| 31 | 1 | 1.0% |
| 30 | 3 | |
| 29 | 1 | 1.0% |
| 28 | 2 | 2.1% |
| 27 | 7 | |
| 26 | 7 | |
| 25.5 | 1 | 1.0% |
| 25 | 7 |
GOODS_DESCRIPTION_len_chars_max
Real number (ℝ)
High correlation 
| Distinct | 49 |
|---|---|
| Distinct (%) | 51.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 108.04167 |
| Minimum | 24 |
|---|---|
| Maximum | 150 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 24 |
|---|---|
| 5-th percentile | 36.75 |
| Q1 | 83.25 |
| median | 101 |
| Q3 | 150 |
| 95-th percentile | 150 |
| Maximum | 150 |
| Range | 126 |
| Interquartile range (IQR) | 66.75 |
Descriptive statistics
| Standard deviation | 38.430502 |
|---|---|
| Coefficient of variation (CV) | 0.35570075 |
| Kurtosis | -0.96890396 |
| Mean | 108.04167 |
| Median Absolute Deviation (MAD) | 36.5 |
| Skewness | -0.42343553 |
| Sum | 10372 |
| Variance | 1476.9035 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 150 | 25 | |
| 100 | 6 | 6.2% |
| 149 | 6 | 6.2% |
| 88 | 3 | 3.1% |
| 113 | 3 | 3.1% |
| 94 | 2 | 2.1% |
| 87 | 2 | 2.1% |
| 99 | 2 | 2.1% |
| 84 | 2 | 2.1% |
| 77 | 2 | 2.1% |
| Other values (39) | 43 |
| Value | Count | Frequency (%) |
| 24 | 1 | |
| 30 | 2 | |
| 31 | 1 | |
| 33 | 1 | |
| 38 | 1 | |
| 42 | 1 | |
| 43 | 1 | |
| 54 | 1 | |
| 55 | 1 | |
| 56 | 1 |
| Value | Count | Frequency (%) |
| 150 | 25 | |
| 149 | 6 | 6.2% |
| 144 | 1 | 1.0% |
| 143 | 1 | 1.0% |
| 140 | 1 | 1.0% |
| 135 | 1 | 1.0% |
| 134 | 1 | 1.0% |
| 133 | 1 | 1.0% |
| 132 | 1 | 1.0% |
| 129 | 1 | 1.0% |
GOODS_DESCRIPTION_len_chars_std
Real number (ℝ)
High correlation  Unique 
| Distinct | 96 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.463062 |
| Minimum | 5.3733348 |
|---|---|
| Maximum | 24.703237 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 5.3733348 |
|---|---|
| 5-th percentile | 8.7080778 |
| Q1 | 12.931953 |
| median | 15.331074 |
| Q3 | 18.055048 |
| 95-th percentile | 21.776572 |
| Maximum | 24.703237 |
| Range | 19.329902 |
| Interquartile range (IQR) | 5.1230948 |
Descriptive statistics
| Standard deviation | 4.0719599 |
|---|---|
| Coefficient of variation (CV) | 0.26333464 |
| Kurtosis | -0.25337489 |
| Mean | 15.463062 |
| Median Absolute Deviation (MAD) | 2.6248741 |
| Skewness | -0.10820759 |
| Sum | 1484.454 |
| Variance | 16.580858 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 22.96797199 | 1 | 1.0% |
| 15.8134861 | 1 | 1.0% |
| 14.28715009 | 1 | 1.0% |
| 17.72909639 | 1 | 1.0% |
| 17.25737594 | 1 | 1.0% |
| 15.50843253 | 1 | 1.0% |
| 14.71184673 | 1 | 1.0% |
| 17.93598698 | 1 | 1.0% |
| 16.23438024 | 1 | 1.0% |
| 15.03329405 | 1 | 1.0% |
| Other values (86) | 86 |
| Value | Count | Frequency (%) |
| 5.373334837 | 1 | |
| 5.802298395 | 1 | |
| 7.73489311 | 1 | |
| 7.823619398 | 1 | |
| 8.395766129 | 1 | |
| 8.812181651 | 1 | |
| 8.875311679 | 1 | |
| 9.275510106 | 1 | |
| 9.770037011 | 1 | |
| 10.19757878 | 1 |
| Value | Count | Frequency (%) |
| 24.70323701 | 1 | |
| 23.95720033 | 1 | |
| 22.96797199 | 1 | |
| 22.16979928 | 1 | |
| 21.823475 | 1 | |
| 21.76093802 | 1 | |
| 21.72689805 | 1 | |
| 21.10993586 | 1 | |
| 21.00291573 | 1 | |
| 20.95948574 | 1 |
subtokenization_indicator_sum
Real number (ℝ)
High correlation  Unique 
| Distinct | 96 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5431.5762 |
| Minimum | 9.4 |
|---|---|
| Maximum | 108947.02 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 9.4 |
|---|---|
| 5-th percentile | 19.829167 |
| Q1 | 178.6354 |
| median | 944.54691 |
| Q3 | 4088.8101 |
| 95-th percentile | 23025.952 |
| Maximum | 108947.02 |
| Range | 108937.62 |
| Interquartile range (IQR) | 3910.1747 |
Descriptive statistics
| Standard deviation | 14678.45 |
|---|---|
| Coefficient of variation (CV) | 2.7024291 |
| Kurtosis | 30.235703 |
| Mean | 5431.5762 |
| Median Absolute Deviation (MAD) | 908.71861 |
| Skewness | 5.16365 |
| Sum | 521431.31 |
| Variance | 2.1545689 × 10⁸ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 181.7710127 | 1 | 1.0% |
| 846.6701687 | 1 | 1.0% |
| 623.302381 | 1 | 1.0% |
| 4614.795259 | 1 | 1.0% |
| 4282.989997 | 1 | 1.0% |
| 2994.08142 | 1 | 1.0% |
| 311.386039 | 1 | 1.0% |
| 250.2694784 | 1 | 1.0% |
| 1051.298301 | 1 | 1.0% |
| 3473.194787 | 1 | 1.0% |
| Other values (86) | 86 |
| Value | Count | Frequency (%) |
| 9.4 | 1 | |
| 9.821428571 | 1 | |
| 16 | 1 | |
| 16.16666667 | 1 | |
| 16.66666667 | 1 | |
| 20.88333333 | 1 | |
| 24.06666667 | 1 | |
| 26.79908009 | 1 | |
| 35.77564103 | 1 | |
| 35.88095238 | 1 |
| Value | Count | Frequency (%) |
| 108947.0216 | 1 | |
| 67054.18812 | 1 | |
| 58631.14629 | 1 | |
| 31715.65547 | 1 | |
| 23964.395 | 1 | |
| 22713.13714 | 1 | |
| 17189.63416 | 1 | |
| 14955.23916 | 1 | |
| 12475.90271 | 1 | |
| 9079.447805 | 1 |
subtokenization_indicator_min
Categorical
Constant 
| Distinct | 1 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.5 KiB |
| 1.0 | 96 |
|---|---|
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 96 | 100.0% |
subtokenization_indicator_mean
Real number (ℝ)
High correlation  Unique 
| Distinct | 96 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.8478004 |
| Minimum | 1.3064248 |
|---|---|
| Maximum | 2.8896593 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 1.3064248 |
|---|---|
| 5-th percentile | 1.4101278 |
| Q1 | 1.6112548 |
| median | 1.7729694 |
| Q3 | 2.0119592 |
| 95-th percentile | 2.3864634 |
| Maximum | 2.8896593 |
| Range | 1.5832345 |
| Interquartile range (IQR) | 0.40070441 |
Descriptive statistics
| Standard deviation | 0.31712636 |
|---|---|
| Coefficient of variation (CV) | 0.17162371 |
| Kurtosis | 1.3685078 |
| Mean | 1.8478004 |
| Median Absolute Deviation (MAD) | 0.19197106 |
| Skewness | 0.99435466 |
| Sum | 177.38884 |
| Variance | 0.10056913 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1.409077618 | 1 | 1.0% |
| 2.889659279 | 1 | 1.0% |
| 1.590057094 | 1 | 1.0% |
| 1.708550633 | 1 | 1.0% |
| 1.768369115 | 1 | 1.0% |
| 1.95308638 | 1 | 1.0% |
| 1.336420768 | 1 | 1.0% |
| 1.58398404 | 1 | 1.0% |
| 1.525832077 | 1 | 1.0% |
| 1.830888132 | 1 | 1.0% |
| Other values (86) | 86 |
| Value | Count | Frequency (%) |
| 1.306424792 | 1 | |
| 1.336420768 | 1 | |
| 1.385263685 | 1 | |
| 1.392222222 | 1 | |
| 1.409077618 | 1 | |
| 1.410477899 | 1 | |
| 1.458142488 | 1 | |
| 1.46969697 | 1 | |
| 1.485742862 | 1 | |
| 1.497962856 | 1 |
| Value | Count | Frequency (%) |
| 2.889659279 | 1 | |
| 2.887057102 | 1 | |
| 2.642158046 | 1 | |
| 2.606536797 | 1 | |
| 2.42121609 | 1 | |
| 2.374879196 | 1 | |
| 2.354018098 | 1 | |
| 2.327005013 | 1 | |
| 2.286300175 | 1 | |
| 2.28403324 | 1 |
subtokenization_indicator_median
Real number (ℝ)
High correlation 
| Distinct | 23 |
|---|---|
| Distinct (%) | 24.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.6203099 |
| Minimum | 1 |
|---|---|
| Maximum | 2.3333333 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1.2767857 |
| Q1 | 1.5 |
| median | 1.6 |
| Q3 | 1.75 |
| 95-th percentile | 2 |
| Maximum | 2.3333333 |
| Range | 1.3333333 |
| Interquartile range (IQR) | 0.25 |
Descriptive statistics
| Standard deviation | 0.25559668 |
|---|---|
| Coefficient of variation (CV) | 0.15774555 |
| Kurtosis | 0.64860803 |
| Mean | 1.6203099 |
| Median Absolute Deviation (MAD) | 0.1 |
| Skewness | 0.20703322 |
| Sum | 155.54975 |
| Variance | 0.065329661 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1.5 | 27 | |
| 2 | 12 | |
| 1.666666667 | 11 | |
| 1.75 | 8 | 8.3% |
| 1.6 | 7 | 7.3% |
| 1.333333333 | 6 | 6.2% |
| 1 | 3 | 3.1% |
| 1.833333333 | 3 | 3.1% |
| 1.416666667 | 2 | 2.1% |
| 1.625 | 2 | 2.1% |
| Other values (13) | 15 |
| Value | Count | Frequency (%) |
| 1 | 3 | 3.1% |
| 1.055555556 | 1 | 1.0% |
| 1.25 | 1 | 1.0% |
| 1.285714286 | 1 | 1.0% |
| 1.333333333 | 6 | 6.2% |
| 1.4 | 2 | 2.1% |
| 1.416666667 | 2 | 2.1% |
| 1.45 | 1 | 1.0% |
| 1.5 | 27 | |
| 1.571428571 | 1 | 1.0% |
| Value | Count | Frequency (%) |
| 2.333333333 | 1 | 1.0% |
| 2.25 | 1 | 1.0% |
| 2.2 | 1 | 1.0% |
| 2 | 12 | |
| 1.897368421 | 1 | 1.0% |
| 1.833333333 | 3 | 3.1% |
| 1.777777778 | 1 | 1.0% |
| 1.75 | 8 | |
| 1.714285714 | 2 | 2.1% |
| 1.666666667 | 11 |
subtokenization_indicator_max
Real number (ℝ)
High correlation 
| Distinct | 50 |
|---|---|
| Distinct (%) | 52.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12.364193 |
| Minimum | 2 |
|---|---|
| Maximum | 59 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 2 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 5.5 |
| median | 9 |
| Q3 | 17 |
| 95-th percentile | 30.5 |
| Maximum | 59 |
| Range | 57 |
| Interquartile range (IQR) | 11.5 |
Descriptive statistics
| Standard deviation | 9.6923248 |
|---|---|
| Coefficient of variation (CV) | 0.78390276 |
| Kurtosis | 4.9267632 |
| Mean | 12.364193 |
| Median Absolute Deviation (MAD) | 4.8333333 |
| Skewness | 1.8228954 |
| Sum | 1186.9625 |
| Variance | 93.941161 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 8 | 7 | 7.3% |
| 3 | 6 | 6.2% |
| 9 | 6 | 6.2% |
| 4 | 5 | 5.2% |
| 11 | 4 | 4.2% |
| 7 | 3 | 3.1% |
| 10.5 | 3 | 3.1% |
| 16 | 3 | 3.1% |
| 25 | 3 | 3.1% |
| 17 | 3 | 3.1% |
| Other values (40) | 53 |
| Value | Count | Frequency (%) |
| 2 | 1 | 1.0% |
| 2.0625 | 1 | 1.0% |
| 2.5 | 2 | 2.1% |
| 3 | 6 | |
| 3.4 | 2 | 2.1% |
| 4 | 5 | |
| 4.333333333 | 1 | 1.0% |
| 4.4 | 1 | 1.0% |
| 4.5 | 1 | 1.0% |
| 4.666666667 | 1 | 1.0% |
| Value | Count | Frequency (%) |
| 59 | 1 | 1.0% |
| 38 | 1 | 1.0% |
| 36 | 1 | 1.0% |
| 33 | 1 | 1.0% |
| 32 | 1 | 1.0% |
| 30 | 1 | 1.0% |
| 27 | 1 | 1.0% |
| 26 | 1 | 1.0% |
| 25 | 3 | |
| 24 | 2 |
subtokenization_indicator_std
Real number (ℝ)
High correlation  Unique 
| Distinct | 96 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.0427778 |
| Minimum | 0.32026997 |
| Maximum | 4.756294 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 1.5 KiB |
Quantile statistics
| Minimum | 0.32026997 |
|---|---|
| 5-th percentile | 0.50750871 |
| Q1 | 0.69806242 |
| median | 0.88526552 |
| Q3 | 1.1212212 |
| 95-th percentile | 2.1402593 |
| Maximum | 4.756294 |
| Range | 4.436024 |
| Interquartile range (IQR) | 0.4231588 |
Descriptive statistics
| Standard deviation | 0.68524075 |
|---|---|
| Coefficient of variation (CV) | 0.65713016 |
| Kurtosis | 15.993194 |
| Mean | 1.0427778 |
| Median Absolute Deviation (MAD) | 0.20596233 |
| Skewness | 3.6080105 |
| Sum | 100.10667 |
| Variance | 0.46955488 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.4513244466 | 1 | 1.0% |
| 4.75629399 | 1 | 1.0% |
| 0.7150886604 | 1 | 1.0% |
| 0.8916367722 | 1 | 1.0% |
| 0.903294991 | 1 | 1.0% |
| 1.270069027 | 1 | 1.0% |
| 0.5005838218 | 1 | 1.0% |
| 0.6590552354 | 1 | 1.0% |
| 0.712196267 | 1 | 1.0% |
| 0.9500163379 | 1 | 1.0% |
| Other values (86) | 86 | 89.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.3202699703 | 1 | 1.0% |
| 0.4014293246 | 1 | 1.0% |
| 0.4513244466 | 1 | 1.0% |
| 0.4798989793 | 1 | 1.0% |
| 0.5005838218 | 1 | 1.0% |
| 0.5098170071 | 1 | 1.0% |
| 0.5297515291 | 1 | 1.0% |
| 0.5630678951 | 1 | 1.0% |
| 0.5822853742 | 1 | 1.0% |
| 0.5870198238 | 1 | 1.0% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 4.75629399 | 1 | 1.0% |
| 4.596942019 | 1 | 1.0% |
| 3.042963297 | 1 | 1.0% |
| 2.370812634 | 1 | 1.0% |
| 2.321687516 | 1 | 1.0% |
| 2.079783195 | 1 | 1.0% |
| 1.99371846 | 1 | 1.0% |
| 1.684930545 | 1 | 1.0% |
| 1.491320463 | 1 | 1.0% |
| 1.461329544 | 1 | 1.0% |
Interactions
Correlations
| | GOODS_DESCRIPTION_len_chars_max | GOODS_DESCRIPTION_len_chars_mean | GOODS_DESCRIPTION_len_chars_median | GOODS_DESCRIPTION_len_chars_min | GOODS_DESCRIPTION_len_chars_std | GOODS_DESCRIPTION_len_chars_sum | GOODS_DESCRIPTION_len_words_max | GOODS_DESCRIPTION_len_words_mean | GOODS_DESCRIPTION_len_words_median | GOODS_DESCRIPTION_len_words_min | GOODS_DESCRIPTION_len_words_std | GOODS_DESCRIPTION_len_words_sum | HS06_count | subtokenization_indicator_max | subtokenization_indicator_mean | subtokenization_indicator_median | subtokenization_indicator_std | subtokenization_indicator_sum |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GOODS_DESCRIPTION_len_chars_max | 1.000 | 0.509 | 0.384 | -0.613 | 0.579 | 0.816 | 0.949 | 0.507 | 0.503 | 0.206 | 0.581 | 0.812 | 0.813 | 0.591 | 0.151 | 0.097 | 0.247 | 0.800 |
| GOODS_DESCRIPTION_len_chars_mean | 0.509 | 1.000 | 0.904 | -0.306 | 0.648 | 0.565 | 0.555 | 0.950 | 0.763 | 0.000 | 0.750 | 0.556 | 0.503 | 0.516 | 0.504 | 0.437 | 0.405 | 0.541 |
| GOODS_DESCRIPTION_len_chars_median | 0.384 | 0.904 | 1.000 | -0.239 | 0.364 | 0.526 | 0.437 | 0.887 | 0.832 | 0.000 | 0.490 | 0.520 | 0.466 | 0.541 | 0.567 | 0.455 | 0.476 | 0.511 |
| GOODS_DESCRIPTION_len_chars_min | -0.613 | -0.306 | -0.239 | 1.000 | -0.306 | -0.707 | -0.608 | -0.332 | -0.393 | 0.760 | -0.326 | -0.709 | -0.711 | -0.544 | -0.204 | -0.163 | -0.251 | -0.697 |
| GOODS_DESCRIPTION_len_chars_std | 0.579 | 0.648 | 0.364 | -0.306 | 1.000 | 0.332 | 0.580 | 0.564 | 0.310 | 0.094 | 0.918 | 0.321 | 0.292 | 0.207 | 0.183 | 0.239 | 0.115 | 0.304 |
| GOODS_DESCRIPTION_len_chars_sum | 0.816 | 0.565 | 0.526 | -0.707 | 0.332 | 1.000 | 0.855 | 0.582 | 0.621 | 0.000 | 0.410 | 0.999 | 0.996 | 0.798 | 0.330 | 0.208 | 0.429 | 0.996 |
| GOODS_DESCRIPTION_len_words_max | 0.949 | 0.555 | 0.437 | -0.608 | 0.580 | 0.855 | 1.000 | 0.551 | 0.523 | 0.000 | 0.608 | 0.851 | 0.848 | 0.655 | 0.233 | 0.164 | 0.319 | 0.845 |
| GOODS_DESCRIPTION_len_words_mean | 0.507 | 0.950 | 0.887 | -0.332 | 0.564 | 0.582 | 0.551 | 1.000 | 0.837 | 0.000 | 0.722 | 0.583 | 0.524 | 0.520 | 0.442 | 0.381 | 0.350 | 0.557 |
| GOODS_DESCRIPTION_len_words_median | 0.503 | 0.763 | 0.832 | -0.393 | 0.310 | 0.621 | 0.523 | 0.837 | 1.000 | 0.136 | 0.432 | 0.623 | 0.580 | 0.608 | 0.425 | 0.300 | 0.402 | 0.603 |
| GOODS_DESCRIPTION_len_words_min | 0.206 | 0.000 | 0.000 | 0.760 | 0.094 | 0.000 | 0.000 | 0.000 | 0.136 | 1.000 | 0.203 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| GOODS_DESCRIPTION_len_words_std | 0.581 | 0.750 | 0.490 | -0.326 | 0.918 | 0.410 | 0.608 | 0.722 | 0.432 | 0.203 | 1.000 | 0.405 | 0.362 | 0.315 | 0.243 | 0.257 | 0.184 | 0.385 |
| GOODS_DESCRIPTION_len_words_sum | 0.812 | 0.556 | 0.520 | -0.709 | 0.321 | 0.999 | 0.851 | 0.583 | 0.623 | 0.000 | 0.405 | 1.000 | 0.996 | 0.797 | 0.321 | 0.200 | 0.424 | 0.995 |
| HS06_count | 0.813 | 0.503 | 0.466 | -0.711 | 0.292 | 0.996 | 0.848 | 0.524 | 0.580 | 0.000 | 0.362 | 0.996 | 1.000 | 0.788 | 0.298 | 0.179 | 0.407 | 0.995 |
| subtokenization_indicator_max | 0.591 | 0.516 | 0.541 | -0.544 | 0.207 | 0.798 | 0.655 | 0.520 | 0.608 | 0.000 | 0.315 | 0.797 | 0.788 | 1.000 | 0.628 | 0.408 | 0.798 | 0.816 |
| subtokenization_indicator_mean | 0.151 | 0.504 | 0.567 | -0.204 | 0.183 | 0.330 | 0.233 | 0.442 | 0.425 | 0.000 | 0.243 | 0.321 | 0.298 | 0.628 | 1.000 | 0.898 | 0.829 | 0.372 |
| subtokenization_indicator_median | 0.097 | 0.437 | 0.455 | -0.163 | 0.239 | 0.208 | 0.164 | 0.381 | 0.300 | 0.000 | 0.257 | 0.200 | 0.179 | 0.408 | 0.898 | 1.000 | 0.597 | 0.248 |
| subtokenization_indicator_std | 0.247 | 0.405 | 0.476 | -0.251 | 0.115 | 0.429 | 0.319 | 0.350 | 0.402 | 0.000 | 0.184 | 0.424 | 0.407 | 0.798 | 0.829 | 0.597 | 1.000 | 0.461 |
| subtokenization_indicator_sum | 0.800 | 0.541 | 0.511 | -0.697 | 0.304 | 0.996 | 0.845 | 0.557 | 0.603 | 0.000 | 0.385 | 0.995 | 0.995 | 0.816 | 0.372 | 0.248 | 0.461 | 1.000 |
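The "High correlation" alerts in this report can be read off the matrix above by counting, for each row, the other fields whose absolute coefficient is large. The Python sketch below does this for two rows copied from the table. The cut-off of 0.5 is an assumption (the report does not state its threshold), although for these two rows it reproduces the field counts given in the alerts. The function name is illustrative.

```python
# Column order of the correlation table above.
columns = [
    "GOODS_DESCRIPTION_len_chars_max", "GOODS_DESCRIPTION_len_chars_mean",
    "GOODS_DESCRIPTION_len_chars_median", "GOODS_DESCRIPTION_len_chars_min",
    "GOODS_DESCRIPTION_len_chars_std", "GOODS_DESCRIPTION_len_chars_sum",
    "GOODS_DESCRIPTION_len_words_max", "GOODS_DESCRIPTION_len_words_mean",
    "GOODS_DESCRIPTION_len_words_median", "GOODS_DESCRIPTION_len_words_min",
    "GOODS_DESCRIPTION_len_words_std", "GOODS_DESCRIPTION_len_words_sum",
    "HS06_count", "subtokenization_indicator_max",
    "subtokenization_indicator_mean", "subtokenization_indicator_median",
    "subtokenization_indicator_std", "subtokenization_indicator_sum",
]

# Two rows copied from the table; the remaining rows are omitted for brevity.
rows = {
    "GOODS_DESCRIPTION_len_chars_std": [
        0.579, 0.648, 0.364, -0.306, 1.000, 0.332, 0.580, 0.564, 0.310,
        0.094, 0.918, 0.321, 0.292, 0.207, 0.183, 0.239, 0.115, 0.304,
    ],
    "GOODS_DESCRIPTION_len_words_min": [
        0.206, 0.000, 0.000, 0.760, 0.094, 0.000, 0.000, 0.000, 0.136,
        1.000, 0.203, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000,
    ],
}

THRESHOLD = 0.5  # assumed cut-off; not stated anywhere in the report

def highly_correlated(name, threshold=THRESHOLD):
    """Return the other fields whose absolute coefficient meets the threshold."""
    return [
        other
        for other, coef in zip(columns, rows[name])
        if other != name and abs(coef) >= threshold
    ]

for name in rows:
    print(name, "->", highly_correlated(name))
```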
Missing values
Sample
First rows
| HS02 | HS06_count | GOODS_DESCRIPTION_len_words_sum | GOODS_DESCRIPTION_len_words_min | GOODS_DESCRIPTION_len_words_mean | GOODS_DESCRIPTION_len_words_median | GOODS_DESCRIPTION_len_words_max | GOODS_DESCRIPTION_len_words_std | GOODS_DESCRIPTION_len_chars_sum | GOODS_DESCRIPTION_len_chars_min | GOODS_DESCRIPTION_len_chars_mean | GOODS_DESCRIPTION_len_chars_median | GOODS_DESCRIPTION_len_chars_max | GOODS_DESCRIPTION_len_chars_std | subtokenization_indicator_sum | subtokenization_indicator_min | subtokenization_indicator_mean | subtokenization_indicator_median | subtokenization_indicator_max | subtokenization_indicator_std |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 01 | 129 | 775 | 1 | 6.007752 | 5.0 | 20 | 3.859842 | 4417 | 4 | 34.240310 | 30.0 | 124 | 22.967972 | 181.771013 | 1.0 | 1.409078 | 1.285714 | 3.0 | 0.451324 |
| 02 | 293 | 1558 | 1 | 5.317406 | 5.0 | 14 | 3.047277 | 9857 | 4 | 33.641638 | 32.0 | 89 | 15.813486 | 846.670169 | 1.0 | 2.889659 | 1.666667 | 33.0 | 4.756294 |
| 03 | 62 | 182 | 1 | 2.935484 | 2.0 | 9 | 1.608084 | 1124 | 4 | 18.129032 | 14.5 | 54 | 10.857473 | 96.351587 | 1.0 | 1.554058 | 1.416667 | 3.4 | 0.608181 |
| 04 | 1233 | 6844 | 1 | 5.550689 | 5.0 | 20 | 3.574472 | 43568 | 4 | 35.334955 | 30.0 | 95 | 19.138776 | 3559.741407 | 1.0 | 2.887057 | 1.750000 | 36.0 | 4.596942 |
| 05 | 15 | 36 | 1 | 2.400000 | 2.0 | 5 | 1.055597 | 205 | 5 | 13.666667 | 14.0 | 30 | 5.802298 | 20.883333 | 1.0 | 1.392222 | 1.333333 | 2.0 | 0.401429 |
| 06 | 83 | 264 | 1 | 3.180723 | 3.0 | 8 | 1.531336 | 1928 | 4 | 23.228916 | 22.0 | 57 | 11.845886 | 169.228571 | 1.0 | 2.038898 | 2.000000 | 7.5 | 1.067709 |
| 07 | 264 | 813 | 1 | 3.079545 | 3.0 | 9 | 1.484174 | 4913 | 4 | 18.609848 | 17.0 | 69 | 8.875312 | 419.930159 | 1.0 | 1.590645 | 1.500000 | 5.5 | 0.681658 |
| 08 | 317 | 821 | 1 | 2.589905 | 2.0 | 10 | 1.414983 | 5250 | 4 | 16.561514 | 15.0 | 42 | 7.823619 | 601.512698 | 1.0 | 1.897516 | 1.666667 | 8.0 | 0.958388 |
| 09 | 559 | 1981 | 1 | 3.543828 | 3.0 | 12 | 2.087974 | 11397 | 4 | 20.388193 | 19.0 | 72 | 10.197579 | 957.351587 | 1.0 | 1.712615 | 1.500000 | 9.0 | 0.735113 |
| 10 | 285 | 1413 | 1 | 4.957895 | 4.0 | 11 | 2.556058 | 7969 | 4 | 27.961404 | 28.0 | 73 | 13.307651 | 454.455700 | 1.0 | 1.594581 | 1.500000 | 5.5 | 0.587020 |
Last rows
| HS02 | HS06_count | GOODS_DESCRIPTION_len_words_sum | GOODS_DESCRIPTION_len_words_min | GOODS_DESCRIPTION_len_words_mean | GOODS_DESCRIPTION_len_words_median | GOODS_DESCRIPTION_len_words_max | GOODS_DESCRIPTION_len_words_std | GOODS_DESCRIPTION_len_chars_sum | GOODS_DESCRIPTION_len_chars_min | GOODS_DESCRIPTION_len_chars_mean | GOODS_DESCRIPTION_len_chars_median | GOODS_DESCRIPTION_len_chars_max | GOODS_DESCRIPTION_len_chars_std | subtokenization_indicator_sum | subtokenization_indicator_min | subtokenization_indicator_mean | subtokenization_indicator_median | subtokenization_indicator_max | subtokenization_indicator_std |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 88 | 293 | 1143 | 1 | 3.901024 | 3.0 | 14 | 2.017161 | 7640 | 4 | 26.075085 | 23.0 | 88 | 14.133942 | 555.728968 | 1.0 | 1.896686 | 1.571429 | 6.200000 | 0.999233 |
| 89 | 98 | 384 | 1 | 3.918367 | 3.0 | 14 | 2.983368 | 2433 | 4 | 24.826531 | 18.0 | 96 | 21.109936 | 153.900794 | 1.0 | 1.570416 | 1.500000 | 4.666667 | 0.643925 |
| 90 | 11611 | 53313 | 1 | 4.591594 | 4.0 | 28 | 2.982281 | 364062 | 3 | 31.354922 | 27.0 | 150 | 19.635811 | 22713.137143 | 1.0 | 1.956174 | 1.750000 | 16.000000 | 1.021780 |
| 91 | 496 | 2162 | 1 | 4.358871 | 4.0 | 23 | 2.693847 | 13770 | 4 | 27.762097 | 25.0 | 134 | 17.199596 | 783.850425 | 1.0 | 1.580344 | 1.400000 | 7.000000 | 0.722901 |
| 92 | 317 | 1055 | 1 | 3.328076 | 3.0 | 13 | 2.118245 | 6798 | 4 | 21.444795 | 17.0 | 88 | 14.485947 | 439.128588 | 1.0 | 1.385264 | 1.000000 | 4.000000 | 0.582285 |
| 93 | 64 | 242 | 1 | 3.781250 | 3.0 | 12 | 2.675395 | 1578 | 4 | 24.656250 | 20.0 | 68 | 16.408688 | 130.598810 | 1.0 | 2.040606 | 1.750000 | 6.500000 | 1.173693 |
| 94 | 7921 | 33169 | 1 | 4.187476 | 3.0 | 32 | 2.764096 | 204300 | 3 | 25.792198 | 21.0 | 150 | 17.272204 | 12475.902712 | 1.0 | 1.575041 | 1.333333 | 16.000000 | 0.859632 |
| 95 | 4191 | 17044 | 1 | 4.066810 | 4.0 | 28 | 2.236884 | 100155 | 3 | 23.897638 | 21.0 | 150 | 12.980664 | 6111.075169 | 1.0 | 1.458142 | 1.333333 | 10.500000 | 0.631161 |
| 96 | 3091 | 12455 | 1 | 4.029440 | 3.0 | 22 | 2.460847 | 75242 | 3 | 24.342284 | 21.0 | 132 | 14.985277 | 5138.436948 | 1.0 | 1.662387 | 1.500000 | 10.500000 | 0.816961 |
| 97 | 94 | 296 | 1 | 3.148936 | 2.5 | 13 | 2.238267 | 1972 | 6 | 20.978723 | 17.0 | 74 | 13.922967 | 139.659829 | 1.0 | 1.485743 | 1.055556 | 4.000000 | 0.721784 |
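In the sample rows above, each `*_mean` column equals the matching `*_sum` column divided by HS06_count, which suggests that every row aggregates the HS06 entries of one HS02 chapter. The Python sketch below checks this relationship for chapters 01 and 90 using the values shown. It is a consistency check under that reading, not a reconstruction of the pipeline that produced the data, and the dictionary layout is illustrative.

```python
import math

# (sum, mean) pairs copied from the sample rows for two HS02 chapters.
sample = {
    "01": {
        "HS06_count": 129,
        "GOODS_DESCRIPTION_len_words": (775, 6.007752),
        "GOODS_DESCRIPTION_len_chars": (4417, 34.240310),
        "subtokenization_indicator": (181.771013, 1.409078),
    },
    "90": {
        "HS06_count": 11611,
        "GOODS_DESCRIPTION_len_words": (53313, 4.591594),
        "GOODS_DESCRIPTION_len_chars": (364062, 31.354922),
        "subtokenization_indicator": (22713.137143, 1.956174),
    },
}

def mean_matches(chapter, field, abs_tol=1e-6):
    """True when sum / HS06_count reproduces the reported mean up to rounding."""
    total, reported_mean = sample[chapter][field]
    return math.isclose(
        total / sample[chapter]["HS06_count"], reported_mean, abs_tol=abs_tol
    )

results = {
    (chapter, field): mean_matches(chapter, field)
    for chapter, fields in sample.items()
    for field in fields
    if field != "HS06_count"
}
print(results)
```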